Counter-stereotypical messaging and partisan cues: Moving the needle on vaccines in a polarized United States

This paper reports results from a large-scale randomized controlled trial assessing whether counter-stereotypical messaging and partisan cues can induce people to get COVID-19 vaccines. The study used a 27-s video compilation of Donald Trump’s comments about the vaccine from Fox News interviews and presented the video to millions of U.S. YouTube users through a $100,000 advertising campaign in October 2021. Results indicate that the number of vaccines increased in the average treated county by 103 (with a one-tailed P value of 0.097). Based on this average treatment effect and totaling across our 1014 treated counties, the total estimated effect was 104,036 vaccines.


A: Robustness Checks for Core Results
We consider two robustness checks for our main results. During our sample period, two types of obvious mis-recordings occur in the CDC county-level daily data. The first is that some counties show a decrease from one day to the next in their cumulative vaccination count. For example, the data can show that a given county has administered a total of 34,500 COVID-19 vaccine first doses since the beginning of the vaccine's availability, up through and including date t, and then show that this number decreases to 33,300 on the following day, which is impossible. The second type of error is simply that the vaccine count is missing for some dates during our sample for certain counties. We replicate our main analysis dropping any counties with misrecorded CDC data. The results are shown in Table S1.
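The two kinds of misrecording described above can be flagged mechanically. The following is a minimal sketch, assuming each county's cumulative first-dose counts are held as a date-ordered list with `None` for missing days; the function name `flag_misrecorded` and the toy county names and values are hypothetical (only the 34,500 to 33,300 drop echoes the example in the text), and this is not the authors' actual cleaning code.

```python
from typing import Dict, List, Optional

def flag_misrecorded(series: Dict[str, List[Optional[int]]]) -> Dict[str, Dict[str, bool]]:
    """Flag, per county, (a) any decrease in the cumulative count between
    consecutive observed days and (b) any missing daily value."""
    flags = {}
    for county, counts in series.items():
        has_missing = any(c is None for c in counts)
        # Assumption: decreases are checked between consecutive *observed*
        # values, so a gap in the data does not hide a later drop.
        observed = [c for c in counts if c is not None]
        has_decrease = any(later < earlier
                           for earlier, later in zip(observed, observed[1:]))
        flags[county] = {"decrease": has_decrease, "missing": has_missing}
    return flags

# Toy, date-ordered cumulative counts (illustrative values, not CDC data).
toy = {
    "county_a": [34000, 34500, 33300, 34600],  # impossible drop on day 3
    "county_b": [1200, None, 1260, 1300],      # one missing date
    "county_c": [500, 520, 520, 545],          # clean series
}

flags = flag_misrecorded(toy)
# Two separate robustness samples, mirroring the two checks in the text.
keep_no_decrease = [c for c, f in flags.items() if not f["decrease"]]
keep_no_missing = [c for c, f in flags.items() if not f["missing"]]
```

Keeping the two flags separate allows the main analysis to be rerun once on each restricted sample, as the two sets of columns in Table S1 do.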
Columns 1-2 replicate the ITT analysis of Table 2 but drop counties that have reported decreases in cumulative vaccine counts, and columns 5-6 instead drop counties that have missing vaccine counts for any date in the sample period. Table S1 shows that the effects are positive, as in Table 2, and slightly larger in magnitude. The results in columns 1-2 have a similar level of statistical significance to those in Table 2, although columns 5-6 are no longer significant at the 0.10 level.
In Table S1, columns 3-4, we report the estimated ACR after dropping from our sample any counties that report a decrease in their cumulative vaccine count. Here we observe an effect that is similar in magnitude and significance to the corresponding estimate in Table 2. In columns 7-8, where we drop counties with CDC data missing for at least one date, the estimated ACR is similar in magnitude but no longer significant.
We also repeat our instrumental variables regressions from Table 1 using a heterogeneous first-stage regression. For this analysis, we create five indicator variables for whether a county is above or below the median in terms of five characteristics: percent Trump voters, percent college educated, percent white, percent with internet access, and county population. As shown in Table S2, these five county-level characteristics are all significantly correlated with the number of ads a county receives, making them candidates for a heterogeneous first-stage regression. We group counties into 32 (2^5) different groups based on the realizations of these five indicator variables and interact these 32 group indicators with Treat × Post. We then repeat the instrumental variables regression of equation (2) using these 32 interactions as excluded instruments for Ads × Post, rather than using only Treat × Post as the excluded instrument. As additional included instruments (included in the first and second stages), we use the interactions of Post with the indicators for below-median Trump voters, percent college educated, percent white, and percent with internet access. Note that the interaction of Post with county population is already included (linearly). The results of this IV regression with a heterogeneous first stage are shown in Table S12. We find similar estimates to those in Table 1, with an average causal response of about 10.9 vaccines to an increase of 1,000 ads, regardless of how we control for differential trends across population.

B: Day-focused Event Study Analysis
Figure S1 contains an alternative to our event study analysis, where we use as the outcome the number of vaccines administered on a given date in a given county (rather than the cumulative number of vaccines so far in the county). These daily-level results are more noisily estimated than the cumulative results. However, consistent with Figure 3.A, Figure S1.A shows that differences between treatment and control counties prior to the start of the campaign were small and largely insignificant. We observe large point estimates later in the campaign (in the last week of October), and then a leveling off at about 10 vaccines per day in the final days of the campaign and the two weeks afterward.
Upon investigation, we learned that the large spike on October 29 seen in Figure S1.A is driven by counties with misrecorded CDC data, in which the county records a decrease in its cumulative vaccination count over time, as discussed in SM Section A. Figure S1.D omits these counties, and the October 29 spike disappears. We also report all of our event study specifications with clustering at the state-and-date level (rather than the county-and-date level), and display the results in Figure S2. Here we see a similar pattern to that in Figures 3 and S1 but with tighter confidence intervals.

C: Departures from Pre-Registration Plan
We pre-registered our analysis plan via the Open Science Framework at https://osf.io/m9yhn/?view_only=c0d43e87224649e88b671eafddb22df8. Our analysis described in the body of the paper follows this pre-registered plan to the extent possible. Specifically, our pre-registered dependent variable is the number of vaccines administered in each county up through a given date. Our pre-registration plan also stated that we would analyze effects of our campaign through difference-in-difference OLS regressions, and we follow this plan throughout.
The plan explained that we would estimate treatment effects on a sample of dates ranging from 14 days before to five days after the campaign, which we refer to here as our restricted sample. We also pre-registered an intention to explore wider date ranges, given uncertainty about how quickly treatment effects would emerge.

Controlling for Differential Trends by Population Size
After the campaign ran, however, we learned of an unanticipated feature of the data that necessitated a modification of our analysis: during the period of our study, growth in vaccination counts differed sharply across counties of different population sizes, leading us to include controls for differential growth rates by county size. To illustrate this differential growth, we first estimate a version of equation (1) without including the Population × Post term. The results are shown in column 1 of Table S11, where we observe a small point estimate (9.8) that is very imprecisely measured. The 95% confidence interval contains our preferred estimate from the body of the paper, 102.6 (from column 1 of Table 1). We then estimate a version of equation (1) without the Treat × Post interaction but including the Population × Post term. As in equation (1), the main effect of population is absorbed by the county fixed effects. The results are shown in column 2 of Table S11.
We find a statistically significant and positive coefficient on the Population × Post term, implying that a county with 10,000 more residents has 275 more vaccinations in the post period. This increase is entirely independent of our experiment, as the results hold across all counties (column 2) and even within control counties alone (column 3). This suggests that, by not controlling for differential growth in vaccines in counties of different sizes, the specification in column 1 leaves a significant amount of statistical noise uncontrolled for. When we include both the Treat × Post and Population × Post interactions, we obtain the effect of 102.6 reported in column 1 of Table 1.
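To illustrate why the population control matters, the sketch below fits a two-way fixed effects difference-in-differences regression on a small synthetic, noise-free panel, once with only Treat × Post and once adding Population × Post. Everything here is an assumption for illustration: the helper `twfe_coefs`, the six toy counties, and the values of `TAU` and `GROWTH` are arbitrary and unrelated to the paper's data or estimates. The toy treated counties happen to be larger than the controls, which mimics the kind of chance imbalance that adds noise to the uncontrolled specification.

```python
import numpy as np

def twfe_coefs(y, regressors):
    """OLS with county and date fixed effects on a balanced panel.

    y and each regressor are (n_counties, n_dates) arrays. Fixed effects are
    removed by two-way demeaning, which is exact for balanced panels.
    """
    def demean(a):
        return (a - a.mean(axis=1, keepdims=True)
                  - a.mean(axis=0, keepdims=True) + a.mean())
    X = np.column_stack([demean(r).ravel() for r in regressors])
    beta, *_ = np.linalg.lstsq(X, demean(y).ravel(), rcond=None)
    return beta

# Synthetic panel: 6 counties observed on 6 dates (3 pre, 3 post).
pop = np.array([5.0, 20.0, 8.0, 40.0, 12.0, 30.0])  # population, arbitrary units
treat = np.array([0, 1, 0, 1, 0, 1], dtype=float)
post = np.array([0, 0, 0, 1, 1, 1], dtype=float)

TAU = 50.0     # true treatment effect (arbitrary)
GROWTH = 10.0  # extra post-period vaccinations per unit of population (arbitrary)

treat_post = np.outer(treat, post)
pop_post = np.outer(pop, post)
county_fe = 100.0 * pop
date_fe = 3.0 * np.arange(6)
y = county_fe[:, None] + date_fe[None, :] + GROWTH * pop_post + TAU * treat_post

# Without the population control, differential growth by county size
# contaminates the Treat x Post coefficient.
naive = twfe_coefs(y, [treat_post])[0]
# With the control, both coefficients are recovered.
controlled = twfe_coefs(y, [treat_post, pop_post])
```

In this toy panel the uncontrolled coefficient equals `TAU` plus `GROWTH` times the gap in mean population between treated and control counties, while the controlled regression returns `TAU` and `GROWTH` themselves.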
An additional, minor departure from our pre-registration plan is the following: the plan described omitting the campaign start date (October 14) from our analysis. We replicated our analysis with and without this date and found that the qualitative and quantitative findings of the study were unchanged.
Though our primary analyses deviate from our pre-registered plan, we believe that the revisions to our approach are appropriate responses to changes in the research environment that could not be sufficiently anticipated at the time of the pre-registration. We also believe that they are in keeping with the spirit of our pre-registration plan and, under the circumstances, provide the most appropriate tools to assess the causal impact of our advertising campaign on vaccine uptake in the targeted counties.

Examining Different Date Windows and County-level Regressions
As outlined in our pre-registration plan, we estimated treatment effects first on a restricted sample of dates and then moved to wider date ranges. Therefore, exploring different date windows was not a departure from our pre-registration plan. We illustrate these results here to compare the specification with and without population controls under different time windows. The restricted sample of dates is a window from 14 days before to five days after the campaign, totaling 72,815 county-date observations. Estimates of the intent-to-treat effect on this restricted sample are shown in columns 4-5 of Table S11, with column 4 omitting the Population × Post interaction and column 5 including it. The results are too imprecisely measured to detect a significant effect in either column, but in the latter the 95% confidence interval contains our preferred estimate from column 1 of Table 1. The final sample we chose to focus on is a wider date range, including dates from one month before the campaign to one month after, which we refer to as our full sample (151,945 county-date observations). We arrived at this window after exploring the event study presented in the Results section. This event study clearly reveals that the ad campaign affected vaccine counts with a lag, underlining the importance of allowing for a wider date range. The Results section discusses several possible sources for this lag.
We now present an alternative regression design that suggests our estimates are not overly sensitive to either the inclusion of population controls or the time window. In this alternative framework, we create a dataset with one observation per county, estimating the following regression:

$$Y_{i,\mathrm{Post}} = \alpha + \beta\,\mathrm{Treat}_i + \gamma\,Y_{i,\mathrm{Pre}} + \varepsilon_i \qquad (6)$$

where $Y_{i,\mathrm{Post}}$ is the number of vaccines in county i on a particular date after the campaign (we examine both November 5 and 30) and $Y_{i,\mathrm{Pre}}$ is the number on a particular date prior to the campaign (we examine both September 30 and 15). In this regression, $\beta$ captures the intent-to-treat effect by controlling for the level of vaccines in the pre period and estimating the increase in vaccines due to treatment assignment. Note that, as this regression only includes one observation per county, the need to cluster disappears and heteroskedasticity-robust standard errors are appropriate.
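A county-level regression of this form, with heteroskedasticity-robust standard errors, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the helper `ols_robust`, the HC1 small-sample correction, and the simulated data (including the arbitrary "true" coefficients) are all hypothetical choices, since the text does not say which robust variance estimator was used.

```python
import numpy as np

def ols_robust(y, X):
    """OLS coefficients with heteroskedasticity-robust (HC1) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich "meat": X' diag(resid^2) X
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv * n / (n - k)  # HC1 degrees-of-freedom scaling
    return beta, np.sqrt(np.diag(cov))

# Simulated one-row-per-county data (illustrative only, not the paper's data).
rng = np.random.default_rng(0)
n = 400
treat = rng.integers(0, 2, size=n).astype(float)    # treatment assignment
y_pre = rng.gamma(shape=2.0, scale=5000.0, size=n)  # pre-campaign cumulative count
noise = rng.normal(0.0, 0.02 * y_pre)               # noise grows with county size
y_post = 1.05 * y_pre + 100.0 * treat + noise       # arbitrary "true" coefficients

# Columns: intercept, treatment indicator, pre-period vaccine count.
X = np.column_stack([np.ones(n), treat, y_pre])
beta, se = ols_robust(y_post, X)
itt_estimate, itt_se = beta[1], se[1]
```

Because the simulated noise scales with the pre-period count, the errors are heteroskedastic by construction, which is the situation in which robust rather than classical standard errors are the natural choice.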
The results of this regression are shown in columns 1 and 3 of Table S12. We find a point estimate of 97 in column 1 when we use the narrow date span (Sep. 30 to Nov. 5) and 114 in column 3 when we use the wider date span (Sep. 15 to Nov. 30). Regardless of the date window, the estimates are similar in size to our main ITT estimate of 103. However, these effects are less precisely estimated than the estimates from the county-by-date panel dataset we use for our main analysis. In columns 2 and 4 we also control for the population of county i. This mirrors the inclusion of the Population × Post interaction in our main panel analysis, allowing for the possibility that vaccine counts increase by more in large counties than in small counties. However, unlike our panel analysis, here these population controls make little difference (the point estimates in columns 2 and 4 are similar to those in columns 1 and 3) because the pre-period vaccine count already captures much of the difference in population across counties.
In columns 5-8 of Table S12, we modify regression (6) by omitting the pre-period vaccine count from the right-hand side and changing the outcome to the change in the vaccine count between the pre-period and post-period dates (Post − Pre). In columns 6 and 8 we also include population as a control. In this case, the inclusion of population controls does matter. This is because the left-hand side is a difference and, without controlling for population, nothing in the regression accounts for the dramatic difference in vaccine growth between large and small counties, analogous to how nothing in regression (1) accounts for this differential if the Population × Post interaction is omitted. Without controlling for population, the point estimates in columns 5 and 7 of Table S12 are quite different from those in columns 1-4. After controlling for population, the results in columns 6 and 8 are essentially the same as those in columns 1-4, and, again, quite similar to our main estimated effect of 103 vaccines. Finally, as in columns 1-4, the estimates in columns 6-8 do not change drastically as we change the date window.
While this alternative regression framework yields point estimates that are similar to our main results, we prefer the panel approach described in the body of the paper as it yields more precise estimates, using information from all dates rather than just one date before and one date after the campaign.
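The table notes in these supplementary materials report randomization inference p-values from one-tailed tests based on 1,000 permutations of treatment assignment. The sketch below shows the general idea for a one-observation-per-county design. It is a simplified illustration under assumptions: the function `ri_pvalue_one_tailed` is hypothetical, the test statistic is a plain difference in means rather than the regression coefficient used in the paper, the permutations ignore any stratification of the original randomization, and the toy outcomes are made up.

```python
import random
from statistics import mean

def ri_pvalue_one_tailed(outcomes, treat, n_perm=1000, seed=0):
    """One-tailed randomization-inference p-value for a difference in means.

    Re-shuffles the treatment labels n_perm times and reports the share of
    shuffles whose statistic is at least as large as the observed one.
    """
    def diff_in_means(assign):
        treated = [y for y, t in zip(outcomes, assign) if t == 1]
        control = [y for y, t in zip(outcomes, assign) if t == 0]
        return mean(treated) - mean(control)

    observed = diff_in_means(treat)
    rng = random.Random(seed)
    shuffled = list(treat)  # shuffling keeps the number of treated units fixed
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if diff_in_means(shuffled) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_perm

# Toy county-level outcomes (made up for illustration).
toy_outcomes = [12, 15, 9, 20, 11, 18, 10, 17]
toy_treat = [0, 1, 0, 1, 0, 1, 0, 1]
observed_effect, p_value = ri_pvalue_one_tailed(toy_outcomes, toy_treat)
```

A stratified design would instead shuffle labels only within each stratum, and a regression-based version would re-estimate the coefficient of interest on each shuffled assignment.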

D: Survey Instrumentation
We contracted with Qualtrics to gather six different 2,400-respondent samples at regular intervals [...]
Table S1 notes: Columns 1-4 report regression results from the same specifications as in Table 2 but using only counties in which CDC records do not show a decrease in the cumulative vaccination count for any date. Columns 5-8 report results as in Table 2 but using only counties in which CDC records are not missing for any date. ITT refers to intent-to-treat and ACR to average causal response. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0.01, 0.05, and 0.10 levels. Standard errors, reported in parentheses below each estimate, are clustered at the county level. Randomization inference p-values are from a one-tailed test based on 1,000 permutations using the treatment effect, (Treat × Post) in ITT columns and (Ads per 100 × Post) in ACR columns, as the randomization test statistic.
Results corresponding to those in Table 1 columns 3-4 but replacing the homogeneous first stage with a heterogeneous first-stage regression. Sample size is smaller than in Table 1 because percent Trump voters and percent of households with internet access are missing for some counties. All regressions include fixed effects at the county and date levels and interactions of the county-characteristics indicators with the Post dummy, as described in Section A of the supplemental materials. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0.01, 0.05, and 0.10 levels. Standard errors, reported in parentheses below each estimate, are clustered at the county level.
Full regression estimates corresponding to results from Table 2. All regressions include fixed effects at the county and date levels. Columns 1-2 correspond to regression (1) and columns 3-4 correspond to regression (2). Columns 2 and 4 replace the Population × Post interaction with interactions of county population with (i) dummies for each date within two weeks before to two weeks after the campaign (omitting the date before the campaign started), (ii) a dummy variable for two weeks or more before, and (iii) a dummy variable for two weeks or more after. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0.01, 0.05, and 0.10 levels. Standard errors, reported in parentheses below each estimate, are clustered at the county level. Randomization inference p-values are from a one-tailed test based on 1,000 permutations using the ITT effect (Treat × Post) as the randomization test statistic.
[...] (1) and (2) where the dependent variable is the total percent of the county population vaccinated at a given point in time and the treatment intensity is measured as the number of ads a county receives per 100 residents. The number of observations is slightly higher in this table than in our main analysis (163,856 county-date observations rather than 151,945) because, for some observations, the vaccination count is missing on certain dates in the CDC data even though the vaccination rate is recorded. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0. [...]
Full regression estimates from regressions in Table 3. Odd columns report results from regression (3) [...]. W is an indicator for whether the value of a given county-level characteristic is below the median of that characteristic across counties in our sample. This characteristic is the 2016 Trump vote share in columns 1-2, the fraction of county residents with a college degree in columns 3-4, and the fraction of county residents who are white in columns 5-6. "***", "**", and "*" indicate significance (from a two-tailed test) at the 0.01, 0.05, and 0.10 levels. The "Effect at W=1" row displays the sum of the coefficients on (Treat × Post × W) and (Treat × Post) in odd columns and the sum of (Ads × Post) and (Ads × Post × W) in even columns. Standard errors, reported in parentheses below each estimate, are clustered at the county level. Randomization inference p-values are from a two-tailed test based on 1,000 permutations using the ITT effect in low-relative-to-high counties (Treat × Post × W) as the randomization test statistic.
Table is analogous to Table S5 but here each regression uses the vaccine rate (the fraction of the county vaccinated) rather than the level as the dependent variable. APH stands for "Ads per 100 residents." Odd columns report results from a regression analogous to (3) and even columns report an IV version of this regression, where we instrument for APH. W is an indicator for whether the value of a given county-level characteristic is below the median of that characteristic across counties in our sample. This characteristic is the 2016 Trump vote share in columns 1-2, the fraction of county residents with a college degree in columns 3-4, and the fraction of county residents who are white in columns 5-6. "***", "**", and "*" indicate significance (from a two-tailed test) at the 0.01, 0.05, and 0.10 levels. The "Effect at W=1" row displays the sum of the coefficients on ( [...]
[...] indicates users for which Google does not know a given characteristic.
Regression results corresponding to Table 2 with alternative standard error estimates, including clustering at the county level (2,032 counties), state level (43 states), or stratum level (20 strata). Heteroskedasticity-robust standard errors use no clusters. The final three rows of standard errors use two-way clustering at the geographical level and at the date level. County clustering corresponds to the standard error estimates reported in Table 2. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0.01, 0.05, and 0.10 levels.
Table S11 notes: Regression results. Columns 1-2 use the full sample, column 3 uses only control counties, and columns 4-5 use the restricted sample period (14 days before to five days after the campaign). All regressions include fixed effects at the county and date levels. Column 1 runs a version of regression (1) without the Population × Post interaction, whereas columns 2-3 run regression (1) without the Treat × Post interaction. Column 4 repeats the specification of column 1, and column 5 repeats the specification of Table 2 column 1, but on the restricted sample of dates. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0.01, 0.05, and 0.10 levels. Standard errors, reported in parentheses below each estimate, are clustered at the county level. Randomization inference p-values are from a one-tailed test based on 1,000 permutations using the ITT effect (Treat × Post) as the randomization test statistic.
Table S12 notes: [...] corresponding to Pre and Post in each column are shown in the last two rows. The number of observations is slightly smaller for the wider date window because some counties have missing observations on those dates. "***", "**", and "*" indicate significance (from a one-tailed test) at the 0.01, 0.05, and 0.10 levels. Heteroskedasticity-robust standard errors are reported in parentheses below each estimate. Randomization inference p-values are from a one-tailed test based on 1,000 permutations using the ITT effect (Treat) as the randomization test statistic.
[...] (4), the effect on the cumulative vaccine count up through a given date. Panels C and D display results from regression (5), the effect on the number of vaccines administered on a given date. Panels on the left use the full sample and those on the right drop counties that ever record a decrease in cumulative vaccine count over time. The shaded region in panels A and B, and the red dashed lines in panels C and D, represent pointwise 95% confidence intervals computed under two-way clustering at the state level and date level.

Figure S3: Screenshot Example of a YouTube Segment to which Trump Ad Was Attached
Example of Fox News segment before which our ad was shown (our ad appeared before this segment 2,740 times).